Trust within human-machine collectives depends on the perceived consensus about cooperative norms

With the progress of artificial intelligence and the emergence of global online communities, humans and machines are increasingly participating in mixed collectives in which they can help or hinder each other. Human societies have had thousands of years to consolidate the social norms that promote cooperation, but mixed collectives often struggle to articulate the norms that hold when humans coexist with machines. In five studies involving 7917 individuals, we document the way people treat machines differently than humans in a stylized society of beneficiaries, helpers, punishers, and trustors. We show that helpers and punishers gain a different amount of trust when they follow norms than when they do not. We also demonstrate that the trust-gain of norm-followers is associated with trustors' assessment of the consensual nature of cooperative norms over helping and punishing. Lastly, we establish that, under certain conditions, informing trustors about the norm-consensus over helping tends to decrease the differential treatment of both machines and people interacting with them. These results allow us to anticipate how humans may develop cooperative norms for human-machine collectives, specifically by relying on norms already extant in human-only groups. We also demonstrate that this evolution may be accelerated by making people aware of their emerging consensus.

Table S6: The relationship between the Trustor's beliefs of norm-consensus over the Punisher's punishing behavior and the trust the Punisher gains from punishing in Study 3. Each experimental condition is denoted by "P," which stands for punishment, signaling that the Punisher is matched with the Trustor in the trust game; the identity of participants in the third-party punishment game is signaled as either a "P," a person, or a "B," a bot, in parentheses after the role they fulfill: P1 (Helper), P2 (Beneficiary), and P3 (Punisher), respectively.

The within-person design (Study 4) is most susceptible to time-effects, such as maturation (natural change in subjects' views of the trust placed in norm-followers, non-followers, or both, which would affect the trust-gain between the two time points at which participants are observed) or history (learning from previous experiences, which could also affect the trust-gain). The between-person design (Study 5) is most susceptible to selection effects (minimized by random assignment, but potentially subject to differential attrition, which is specifically addressed in Supplementary Note 13) and differential time-effects (also minimized by random assignment, as the treatment and control groups were measured at the same time).
To address some of the shortcomings of the within-person design of Study 4, we performed the following robustness check. We included a question in the experiment asking how the two studies (Studies 2 and 4) compare (specifically: "You have been invited to this study as a result of your participation in a study earlier. How do you think these two studies compare?"), with options presented on a 4-point scale ranging from the studies being identical to being completely different, and a fifth option of "I don't know as I do not remember the details of the previous study." To alleviate concerns about recall, we carry out our analysis only on participants who disclosed that they did not remember the study (N = 172, 59.93% of the sample) and on those who believed that the two studies were different (N = 21, 7.31% of the sample). That is, we drop participants who believed the studies were identical (which, technically, they were not, although they differed in only a single sentence; N = 10, 3.48% of the sample) and those who thought the studies were similar (N = 84, 29.27% of the sample).
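For concreteness, a minimal sketch of this sample restriction is given below; the file name, column name, and category labels are hypothetical stand-ins for the actual survey data layout.

```python
import pandas as pd

# Hypothetical file and column names; the actual survey export differs.
df = pd.read_csv("study4_responses.csv")

# Keep participants who did not remember the earlier study or who thought
# the two studies were completely different; drop "identical" and "similar".
keep = df["study_comparison"].isin(["do_not_remember", "completely_different"])
robust_sample = df[keep]
```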
We re-estimate the comparisons reported in Figure 4 of the main paper on the smaller sample described above and present them in Figure S11; they offer substantively similar conclusions.
Last but not least, we investigate the biases specific to Study 5, which center on selective attrition. We group these results together with similar investigations in the other studies and discuss them in Supplementary Note 13.

Meta-Analysis: Pooling the Estimates of Studies 4 and 5
To bolster our quest for robustness, we also conduct a meta-analysis of Studies 4 and 5. In doing so, we closely follow Morris and DeShon, who discuss how effect size estimates may be combined across studies that employ different designs (1). They suggest that "meta-analysis on effect sizes from alternate designs can be performed using standard procedures, as long as (a) the effect sizes are first transformed into a common metric and (b) the appropriate sampling variance formulas are used when estimating the mean and testing for homogeneity of effect size" (ibid, p.119). We undertake these steps consecutively.
First, we transform the effect sizes into a common metric on the basis of equations (11) and (12) from (1), which give the following cross-walk:

$$d_{IG} = d_{RM}\,\sqrt{2(1-\rho)},$$

leading to an estimate of the effect size in the raw-score metric based on the estimate from the change scores, $d_{RM}$, where ρ is the correlation between pre- and post-test scores. In our case, ρ is the correlation between the trust-gain with and without the norm-manipulation in Study 4, following Morris and DeShon, who state that "an aggregate of the correlational data across the single-group pretest-posttest designs provides the best estimate of the population correlation" (ibid, p.120).
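As a minimal illustration of this cross-walk (assuming the raw-score metric as the common metric), the sketch below converts a change-score effect size into that metric; the function name and the numeric inputs are hypothetical and do not correspond to our estimates.

```python
import math

def to_raw_score_metric(d_rm: float, rho: float) -> float:
    """Convert a repeated-measures (change-score) effect size d_RM into the
    raw-score metric, using the pre/post correlation rho."""
    return d_rm * math.sqrt(2.0 * (1.0 - rho))

# Hypothetical values for illustration only.
print(to_raw_score_metric(d_rm=0.40, rho=0.55))
```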
Second, we compute the sampling variances defined in Table 2 in (1) for the two studies separately.

The sampling variance for Study 4 (the repeated-measures design) is

$$\operatorname{Var}(d_{RM}) = \frac{n-1}{n(n-3)}\left(1 + n\,\delta_{RM}^{2}\right) - \frac{\delta_{RM}^{2}}{c(df)^{2}}, \qquad (2)$$

where n is the number of paired observations and $\delta_{RM}$ is the population effect size in the change-score metric, specifically $\delta_{RM} = \mu_{D,E}/\sigma_{D,E}$, where $\sigma_{D,E}$ is the standard deviation of the change scores and $\mu_{D,E}$ is their mean. Additionally, c(.) is the bias function, $c(df) = 1 - 3/(4df - 1)$, with df = n − 1, where n continues to refer to the number of paired observations.

The sampling variance for Study 5 (the independent-groups design) is

$$\operatorname{Var}(d_{IG}) = \frac{N-2}{\tilde{n}(N-4)}\left(1 + \tilde{n}\,\delta^{2}\right) - \frac{\delta^{2}}{c(df)^{2}}, \qquad (3)$$

where $\tilde{n} = n_E n_C/(n_E + n_C)$, $n_E$ denotes the size of the experimental group (in our case, those who received information about the norm-consensus), $n_C$ the size of the control group, which did not receive the norm-consensus information, and N is the combined number of observations, $n_E + n_C$. Additionally, c(.) is the bias function, $c(df) = 1 - 3/(4df - 1)$, with df = $n_E + n_C - 2$. Note that the population effect size is replaced by its estimate when calculating the sampling variances, and the estimate is taken to be the simple (unweighted) average of the effect sizes across the two studies.

Now that the effect sizes are expressed in the same metric, we combine them by weighting the estimates from the individual studies by the reciprocal of their sampling variances, which provides the most accurate estimate (the summations include only two terms, for Studies 4 and 5):

$$\bar{\delta} = \frac{\sum_{i} w_i\, d_i}{\sum_{i} w_i},$$

where the $w_i$ are the reciprocals of the sampling variances, $1/\operatorname{Var}_{RM}$ and $1/\operatorname{Var}_{IG}$, respectively, for the two studies.

Supplementary Table S10: The trust-gain of the bot Helper as a result of the norm-consensus manipulation. $n_T$ stands for the size of the group that received the norm-manipulation, and $n_C$ stands for the size of the group that did not; d is the effect size estimate; c is the value of the bias function; $\operatorname{Var}_{IG}$ and $\operatorname{Var}_{RM}$ are the sampling variances computed on the basis of equations (2) and (3); and the w's stand for the respective weights used to weight the effect size estimates.

Supplementary Table S11: The trust-gain of the Helper sharing with a bot Beneficiary as a result of the norm-consensus manipulation. $n_T$ stands for the size of the group that received the norm-manipulation, and $n_C$ stands for the size of the group that did not; d is the effect size estimate; c is the value of the bias function; $\operatorname{Var}_{IG}$ and $\operatorname{Var}_{RM}$ are the sampling variances computed on the basis of equations (2) and (3); and the w's stand for the respective weights used to weight the effect size estimates.
For the test of homogeneity, we obtain the observed variance of the effect sizes (the summation includes only two terms, since we are focusing on two studies):

$$\hat{\sigma}^{2}_{d} = \frac{\sum_{i} w_i\,(d_i - \bar{\delta})^{2}}{\sum_{i} w_i},$$

where the $w_i$ are given above and are the reciprocals of the sampling variances, $1/\operatorname{Var}_{RM}$ and $1/\operatorname{Var}_{IG}$, respectively, for the two studies.
We also obtain the variance due to sampling error, which is estimated as the weighted average of the individual study variances:

$$\hat{\sigma}^{2}_{e} = \frac{\sum_{i} w_i\,\hat{\sigma}^{2}_{e_i}}{\sum_{i} w_i},$$

where $\hat{\sigma}^{2}_{e_i}$ is the sampling variance as defined in (2) and (3). The effect sizes are viewed as homogeneous when $Q = k\hat{\sigma}^{2}_{d}/\hat{\sigma}^{2}_{e}$, checked against a $\chi^{2}$ distribution with k − 1 degrees of freedom (where k is the number of studies, in this case 2), does not lead to a rejection of the null hypothesis. Alternatively, Hunter and Schmidt suggest calculating the ratio $\hat{\sigma}^{2}_{e}/\hat{\sigma}^{2}_{d}$ and combining the estimates, i.e., accepting them as homogeneous, when this ratio exceeds 0.75 (the 75% rule).
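To make these computations concrete, here is a minimal Python sketch of the quantities defined above (bias function, design-specific sampling variances, inverse-variance pooling, and the homogeneity checks); the function names and the illustrative inputs are hypothetical, not our actual estimates, and the formulas follow the reconstruction given above.

```python
import math

def bias_c(df: int) -> float:
    """Small-sample bias function c(df) = 1 - 3 / (4*df - 1)."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)

def var_rm(n: int, delta: float) -> float:
    """Sampling variance for a single-group pretest-posttest (repeated-measures)
    effect size with n paired observations; delta is the (estimated) effect size."""
    df = n - 1
    return (df / (df - 2.0)) * (1.0 / n) * (1.0 + n * delta ** 2) - delta ** 2 / bias_c(df) ** 2

def var_ig(n_e: int, n_c: int, delta: float) -> float:
    """Sampling variance for an independent-groups effect size with group sizes n_e and n_c."""
    n_tilde = n_e * n_c / (n_e + n_c)
    df = n_e + n_c - 2
    return (df / (df - 2.0)) * (1.0 / n_tilde) * (1.0 + n_tilde * delta ** 2) - delta ** 2 / bias_c(df) ** 2

def pool_studies(d: list[float], var: list[float]) -> dict:
    """Inverse-variance weighted mean plus the homogeneity quantities for k studies."""
    w = [1.0 / v for v in var]
    d_bar = sum(wi * di for wi, di in zip(w, d)) / sum(w)
    sigma2_d = sum(wi * (di - d_bar) ** 2 for wi, di in zip(w, d)) / sum(w)  # observed variance
    sigma2_e = sum(wi * vi for wi, vi in zip(w, var)) / sum(w)               # sampling-error variance
    k = len(d)
    q = k * sigma2_d / sigma2_e        # compare against a chi-square with k - 1 degrees of freedom
    ratio = sigma2_e / sigma2_d if sigma2_d > 0 else float("inf")            # 75% rule
    return {"d_bar": d_bar, "sigma2_d": sigma2_d, "sigma2_e": sigma2_e, "Q": q, "ratio": ratio}

# Hypothetical illustration: two effect sizes already expressed in a common metric.
d4, d5 = 0.35, 0.42
delta_hat = (d4 + d5) / 2   # unweighted average used in place of the population effect size
v4 = var_rm(n=250, delta=delta_hat)
v5 = var_ig(n_e=300, n_c=310, delta=delta_hat)
print(pool_studies([d4, d5], [v4, v5]))
```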
We produce a table comparable to Table 3 in (1), separately for both of the experimental conditions determined by the identity of the players in Stage 1, and present all the metrics in Tables S10-S11 for the sake of completeness. Note that Player 2 in the context of our paper is the Beneficiary, while Player 1 is the Helper.
Responses were first coded to establish a concise set of themes. After developing this coding scheme, two independent coders coded all responses (one pair coded Studies 2 and 4, while another pair coded Study 5), then compared their codes, discussed and adjudicated discrepancies, and agreed on the unified codes that we analyze. The independent coders were research assistants who were not familiar with the hypotheses tested in the study or with how these data would be deployed, to avoid any conscious or unconscious biases.
Participants were not aware, when they made their decisions, that we would later ask them to justify them. This approach alleviates the concern that participants would avoid decisions, such as not sharing their resources with a person, that they deem inappropriate or that would make them feel judged, but that they would in fact make absent such pressures. While for these reasons we believe the decisions themselves do not carry such biases, the justifications might simply be a result of motivated reasoning. For example, participants may have simply wanted to maximize their bonus and act selfishly, but they might have justified their selfish decision by highlighting that they were in dire need of the money to pay their bills. These pressures apply to all experimental conditions, though it is possible that their severity varies with the identity of the players signaled to participants: participants might have been less likely to hide selfish reasons when Beneficiaries were bots, for example.

Helpers' Justifications
To classify Helpers' justifications, the following 12 codes were developed and applied to all responses. Codes are not exclusive, as many justifications (45.7%) contain more than one reason.
1 The decision was made with the Punisher's decision (possible punishment) in mind.
1a I wanted to avoid punishment.
1b I chose to take/avoid a risk: I may/may not get punished.
1c Referencing the way in which Punishers make their decision.
2 The decision was made on the basis of the identity of the Beneficiary (either because they were a bot or because they were a person).
2a Given who the Beneficiary is, they may need/not need the money.
2b Given who the Beneficiary is, their feelings could be/could not be hurt.

3 The decision was made to impress the Trustor.
3a Specifically, to have the Trustor to think of the Helper as a nice/fair/trustworthy person.
3b Specifically, to ensure that the Trustor sends more of their resources to the Helper.
4 The decision was made on the basis of some higher-level or universal principle. E.g., to ensure "equality" or because the participant is a fair/moral/ethical person and/or their actions were a fair/generous/moral/ethical thing to do.
5 The decision was made on the basis of the identity (either universal or temporal characteristic) of the Helper as a "nice person" or a "person in need."
6 The decision was made to maximize the Helper's monetary gain, without giving an indication if it is in reference to Stage 1, Stage 2, or the combination.
7 The decision was made on the basis of the reciprocity principle: the Helper treated the Beneficiary the way they would expect to be treated were they in the Beneficiary's shoes.
8 The decision was made with the owner of the bot in mind.
9 The participant was confused (most often did not believe the identity of players signaled).
9a Based on the Helper's justification, the participant misunderstood the rules of the game in some way. E.g., they indicated that the money they would send to Beneficiary would be doubled.
9b Based on the Helper's justification, the participant misunderstood who has information about the identities of the players, which was explicitly signaled to all decision makers.

10 The reason is rooted in what all players should do.
11 The decision was made because the Trustor cannot be influenced.
12 The justification did not meet any of the above classifications.
In Table S12 we provide a typical response for each code, and Table S13 provides the distribution of justifications across experimental conditions. We find support for the assertion that Helpers consider Trustors' decisions when they decide whether or not to share their resources with the Beneficiaries. These strategic considerations are more prevalent when Beneficiaries and Punishers are people (23.8%) compared to when Beneficiaries are bots (19.7%). In fact, impressing the Trustor to send more money was the second most common reason mentioned in the human-only condition, while in the condition when the Beneficiary was a bot it was only the sixth most common. In substantive terms, Helpers are highly concerned about what their behavior might mean to Trustors when they interact with other humans, but these considerations do not disappear when they interact with bots, as they still ponder the meaning of their actions in the eyes of others. While this aspect of signaling has been the main focus of our argument, helping norms (which are implied by Helpers' concerns about the punishment they might receive) also figured into Helpers' decision making. Specifically, Helpers wanted to avoid punishment both when they were paired with human Beneficiaries (9.8%) and when they were paired with a bot Beneficiary (10.8%). References to higher-level principles were made much more frequently when Beneficiaries were people (47.0%) than when Beneficiaries were bots (21.0%). Many participants stated simply that they made their decision because the Beneficiary was a bot, without giving much further detail, or noted that bots did not need money. These considerations clearly crowd out, but do not eliminate, concerns about punishment, the desire to impress Trustors, or thoughts of higher-level principles, such as fairness. In sum, signaling motives and higher-level principles/ethics dominate Helpers' justifications, albeit with variation by condition.
Code / Example answer
1 I could keep all of my money depending on the choices of player 3.
1a I did not want to be punished by player 3 for not sharing.
1b I will take the risk of player three punishing me if it means a chance at keeping the money. If they don't punish me, then I keep all the money, but if they do, I would end up with 50 cents which is the same as if I would have shared with player 2.
1c I decided to keep the money because I don't know what Player 3 will do.
2 I decided not to share with Player 2 because it was a bot. Or: I think I should share even though the player is a robot. I.e., referencing the identity of the Beneficiary led to both outcomes.
2a I didn't share with Player 2 because they were a bot, and they wouldn't really benefit at all by my sharing.
2b I felt that the bot had no feelings so it wasn't amoral to keep the money.
3 I wanted it to reflect positively on the next stages.
3a I would hope player 4 would think I was generous and would return half the money to him or her.
3b I am willing to share to better my chances of a better share of the pot in the next stage.
4 It is the right thing to do. Or: I like to be equal and share the wealth. There is no reason I should not share the money.
5 Many participants volunteered explanations that point to them being selfish/greedy/self-interested as well as being nice/fair/kind.
6 My decision was based on the goal to maximize my payout.
7 I believe that most participants would make the same decision as me, if they were in my place.
8 At first I thought it would be futile to share with a bot because no one would really benefit but then I figured someone must own the bot and so I would be sharing with them.
9 I am also assuming these are bots and not actual people.
9a I hope player two sends me back .75 cents. (The Beneficiary made no decisions in the game; therefore, they had no opportunity to send any money back.)
9b So while player 3 probably wouldn't punish me for not sending to a bot, player 4 doesn't know it's a bot and might just think I'm selfish. Then again, maybe I remember wrong and player 4 knows player 2 was a bot. but I think since player 4 only joins later they aren't informed.
10 if we can all help each other out I think it'd be advantageous to all of us.
11 I didn't share with Player 2 because I think that Player 4 won't care either way if I shared or not and will probably keep their bonus to themselves anyway.
12 I decided how much money I wanted in to end.

Supplementary Table S12: Typical answers of Helpers matching each code introduced. Note that in the experiment the language of Beneficiary (Player 2), Helper (Player 1), Punisher (Player 3), and Trustor (Player 4) was avoided; e.g., Helpers may have shared more if Players 2 were called "Beneficiary," as it suggests that they "need" to benefit.

Supplementary Table S13: The distribution of reasons for making a decision by the Helper. The first column contains the reasons, the second the share of times that a specific reason was mentioned in the human-only condition, and the third the share of times that a specific reason was mentioned in the condition when the Beneficiary was a bot.

Trustors' Justifications
To classify Trustors' justifications, 13 codes were developed and applied to all responses. Since our study used the strategy method, justifications were also categorized to reflect that some participants reasoned through their decisions when Helpers shared (A); when Helpers did not share (B); both when Helpers shared and when they did not (C); and, in some cases, it was unclear from the response which of these the Trustor justified (D). Which decision (sharing vs. not sharing) was referred to varied somewhat across experimental conditions, but no consistent patterns emerged across studies.
Codes are not exclusive, as many justifications contain more than one reason.
1 The decision rested on the principle of "consistency" of behavior: what Helpers did in Stage 1 (helping/not helping), they will also do in Stage 2 (sending money back/not sending money back).
2 The reasoning is rooted in considering the risk involved (taking/not taking a risk).
3 The decision rested on the identity of the Trustor: being a fair/moral/ethical person and/or regarding the act of sending/not sending money as a fair/moral/ethical act. I.e., referencing some higher or universal principle (e.g., equality) or some situational element (the Trustor is a person in need of money).
4 The decision was motivated by wanting to reward/punish the Helper.
5 The decision was made on the basis of the identity of the Helper (either because they were a bot/or because they were a person).
5a The Helper needs/does not need money/makes no sense to give money to them.
8 The Trustor aimed to maximize their bonus (without further referencing any of the other justifications, e.g., hoping the Helper would share their resources without referencing that they would be consistent, or any other reason).
9 The decision was made based on the Helper's behavior without giving any further details (i.e., the Trustor differentiated, but did not explain why, simply restating what the decision was, not why it was made).
10 The decision was rooted in the Beneficiary's identity.
11 The participant was confused.
11a The justification gave an indication that the Trustor misunderstood the rules of the game in some way. E.g., assumed that non-sharing Helpers are always punished; or assumed that they did not know how the Helper decided earlier (which was explicitly signaled to them).
11b The justification gave an indication that the Trustor misunderstood who has information about the identities of the players, which was explicitly signaled to all decision makers.
11c The Trustor did not believe the identity of whom they were paired with (bot/person).
12 The decision was made so that the Helper would "do the right thing" in the future.
13 The justification did not meet any of the above classifications.
In Table S14 we provide typical responses for each code. Following this, we make several comparisons. Specifically, we start by outlining how the distribution of justifications in Study 2 differed across experimental conditions (Table S15). We find strong support for the assertion that Trustors considered Helpers' decisions, and reflected mostly on the consistency they anticipated of the Helpers' actions across Stage 1 and Stage 2, which varied with experimental condition. Importantly, in the human-only condition 41.6% referred to this principle; while this reasoning was still the most prominent one, participants referred to the consistency of the bot Helper in only 31.3% of cases and intuited such consistency of Helpers 27.9% of the time when they were paired with bot Beneficiaries. In sum, in the eyes of Trustors, the strength of the signal about Helpers diminishes considerably when Helpers were paired with bot Beneficiaries. These justifications also allude to norms, and while (understandably) Trustors do not use the language employed in the scholarly literature, a few participants expressed that they wished to reward or punish Helpers. These terms reference norm enforcement, at twice the rate in the human-only condition (8.3%) as in the condition when the Helper is a bot (4.3%), with the condition where the Beneficiary is a bot in the middle (6.0%). Not surprisingly, the Helper's identity figures into Trustors' justifications at much higher rates when the Helper is a bot (12.3%) compared to when they are a person (0.6% and 1.7%, respectively), and most of the Trustors' concerns focus on the way in which Helpers make decisions (asserting that they make them randomly 2.1% of the time, and expressing that they do not know how they make them/how the bots were programmed 11.7% of the time). This is a crucially important observation: while Trustors mention that bots do not need money (2.5%) and that they do not have feelings (1.5%), they focus on the uncertainty of what meaning to attach to their Stage 1 decisions.
We now turn to examining the same frequencies over justifications among participants who believed the norm signal in Study 4 (Table S16). The comparison between justifications with the norm signal and without it is substantively important. When participants receive the norm signal, an additional 6.0% of them think of consistency when the Helper is a bot, and an additional 10.0% when the Beneficiary is a bot. While there is a slight shift (5.2%) in the prominence of consistency in the human-only condition, the gap between this condition and the others shrinks. From these descriptive analyses, it appears that norm signals clarify the meaning of the behavior of Helpers. Last but not least, Trustors ponder the way bot Helpers make decisions, not their need for money or lack of feelings.
Code / Example answer
1 Based on the past decision of Player 1 I felt like if the player 1 shared in the past they are likely to send money back if I send them money now in stage 2.
2 I chose to send 100 cents in each scenario. Even though Player 1 did not share in Scenario 1, I am taking a gamble that they will split the profit with me. Scenario 2 seems less risky as they chose to share in the first stage.
3 I tried to send the amount that would give us both equal bonus amounts or close to it.
4 I sent more money in the scenario that player 1 did share with the other player. I thought it was a good thing to reward generosity.
5 I just felt like I couldn't trust a bot to do the right thing and understand the situation.
5a I didn't care to send any money to P1 because it's a bot. It's not going to get a bonus or be able to use one.
5b I chose not to send any money to player 1 because they are a bot and I don't think that they would be sensitive to humans and feel the need to be fair by sending money back if I chose to send them something.
5c Because player 1 is a computer, it feels more like rolling dice which I'm totally fine with.
5d Being a bot I am still unsure how it will decide in this round.
6 Everyone on Turk needs money, most of us are on here because life didn't go how we thought it would. If me sending money helps someone else I want to do it so we can all earn and pay our bills.
7 Based on how generous he was.
8 I based it on how likely I thought he was to return anything. I also wanted to be the safest for my return on the game.
9 Based on if he shared or not.
10 I agreed with their decision about sharing with the Bot.
11 I do not want to risk losing my money voluntarily. I know I could be punished, but it is a risk I am willing to take. (In Stage 2, there is no Punisher; nor can the Helper they were paired with punish them when they "did not risk" any money.)
11a Also, the rules for stage 2 said that player 4 was to be told how player 1 behaved in stage 1, which apparently is not the case? (The respondent did not understand the strategy method, and appeared to be confused about having to make two decisions.)
11b I am trying to maximize my money, hopefully Player 1 will pick up on this, but I am reliant on him/her/bot. (Despite the identity of the Helper being clearly signaled, this participant seems not to know this information.)
11c I decided to keep my money no matter what. I have a suspicion the other MTurk worker isn't real.
12 I would in theory like to reward the one that shared, but at the same time I'm hoping that by sharing with the non-sharer, they will remember that kindness in the future.
13 100.
Supplementary Table S14: Typical answers of Trustors matching each code introduced. Note that in the experiment the language of Beneficiary (Player 2), Helper (Player 1), Punisher (Player 3), and Trustor (Player 4) was avoided; e.g., Helpers may have shared more if Players 2 were called "Beneficiary," as it suggests that they "need" to benefit.

Supplementary Table S15: The distribution of reasons for making a decision by the Trustor in Study 2. The first column contains the reasons, the second the share of times that a specific reason was mentioned in the human-only condition, the third the share of times that a specific reason was mentioned in the condition when the Helper was a bot, and the fourth the share of times that a specific reason was mentioned in the condition when the Beneficiary was a bot.

Supplementary Table S16: The distribution of reasons for making a decision by the Trustor in Study 4 among those who believed the norm. The first column contains the reasons, the second the share of times that a specific reason was mentioned in the human-only condition, the third the share of times that a specific reason was mentioned in the condition when the Helper was a bot, and the fourth the share of times that a specific reason was mentioned in the condition when the Beneficiary was a bot.

Supplementary Table S17: The distribution of reasons for making a decision by the Trustor in Study 5, among those who have not received the norm signal and those who did and believed it. The first column contains the reasons; the second and fifth the share of times that a specific reason was mentioned in the human-only condition, the third and sixth the share for the condition when the Helper was a bot, and the fourth and seventh the share for the condition when the Beneficiary was a bot.
We now turn to examining the same frequencies over justifications among participants in Study 5, displaying the answers of those who did not receive the norm signal and of those who did and believed it (Table S17). The differences are similarly telling as in the comparison between Studies 2 and 4. Mentions of consistency increase when participants receive the norm signal, and so do mentions of the characteristics of Helpers. Importantly, Studies 2 and 4 were coded by one pair of coders, while Study 5 was coded by another pair. Upon further inspection, the two pairs resolved the ambiguity between the "consistency" code and the "Helper characteristics" code differently, the first pair being more generous in assigning the consistency code. Potentially similar coder differences underlie the discrepancy between studies in assessing how risk-taking or risk-aversion influenced Trustors' decisions. Importantly, in Study 5 as well, Trustors focused on how bots make decisions (e.g., randomly, or in unclear ways) rather than on bots' need for money or their feelings. Taken together, despite likely variability across coders, the same qualitative differences are borne out in the data: norm signals solidify the meaning that Trustors attach to Helpers' actions.
Taken together, these justifications do not appear to refer to a perspective commonly discussed in the behavioral economics literature, namely that decision makers (in this case, Helpers or Trustors playing with bots) consider the people responsible for the bot's design, or the financial implications for the researchers of giving up resources to the bot. In fact, only one of the Helpers (and not a single Trustor) mentioned this perspective. Additionally, while confusion about the rules of the two-stage game is always a theoretical possibility, in our experiments few participants gave any indication of such confusion (never more than 4% of the participants per experimental condition, even though we were extremely generous in identifying such reasons by including answers that simply expressed that the participant did not believe the experimenter about the identities of the players signaled). These observations further bolster our assertion that our interpretations anchored on signaling are likely to be correct in this context.
There are some clear limitations of the qualitative data. The use of the strategy method, consistent with the design of Jordan and colleagues (3), creates a different situation compared to asking participants to make a single trust decision (once when paired with a norm-following and once when paired with a norm-breaking Helper). This feature clearly influenced how participants reasoned.
Additionally, since the study was survey-based, for respondents who simply mentioned that a decision was made based on the identity of a player and noted that the player was a bot, there was no opportunity to ask them to elaborate, specifically on whether they thought of bots as not needing money, not having feelings, or being unpredictable. Future work could be designed with the emerging themes documented here in mind, and with renewed emphasis on people's expectations about how bots make decisions.
Supplementary Note 9: Comparing the Distribution of Trust-gain in Study 2 and Study 3

In Study 2 the trust-gain is measured on the basis of the trust decisions of the Trustor with real monetary stakes, while in Study 3 this measure is based on hypothetical decisions. Since the main goal of Study 3 is to correlate one's perceptions of the norm-consensus in a given situation with the trust-gain based on trust decisions, both of these measures need to be collected from the same individuals. However, having the same people participate in the strategic game and then immediately answer norm-related questions would likely have yielded biased responses. Here, the main risk is that participants' decisions might have influenced their norm-consensus assessments; e.g., those acting selfishly could have "guessed" that there is no consensus over norms in order to manage their impression in front of the experimenter, thereby introducing a correlation consciously or unconsciously. To avoid such a confound, we measure the norm-consensus information first, and then have participants make trust decisions. However, this induces a discrepancy between the designs of Study 2 (real-stakes decisions) and Study 3 (hypothetical decisions). Given this, we here compare the distribution of the trust-gain in these studies using two measures: the Bhattacharyya coefficient and η, the overlapping index.
The Bhattacharyya coefficient (BC) is a measure of similarity between two discrete probability distributions p and q over the same domain X:

$$BC(p, q) = \sum_{x \in X} \sqrt{p(x)\,q(x)},$$

where values close to 1 suggest that the two distributions are similar (with BC = 1, the distributions are identical), while values close to 0 indicate that the distributions are different.
The overlapping index η is

$$\eta(f_A, f_B) = \int \min\{f_A(x), f_B(x)\}\,dx,$$

where $f_A(x)$ and $f_B(x)$ are two real probability density functions over the same domain. The overlapping index η takes values in [0, 1], and the integral can be replaced by a summation in the discrete case (including the present case). Similarly to the Bhattacharyya coefficient, η close to 1 indicates that the distributions are similar, while η close to 0 indicates the opposite.
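As a minimal sketch, both similarity measures can be computed for two distributions binned onto the same discrete support; the arrays below are hypothetical and are not our experimental data.

```python
import numpy as np

def bhattacharyya(p: np.ndarray, q: np.ndarray) -> float:
    """Bhattacharyya coefficient of two discrete distributions on the same support."""
    return float(np.sum(np.sqrt(p * q)))

def overlap_eta(p: np.ndarray, q: np.ndarray) -> float:
    """Overlapping index eta of two discrete distributions on the same support."""
    return float(np.sum(np.minimum(p, q)))

# Hypothetical binned trust-gain distributions for two studies.
p = np.array([0.10, 0.20, 0.40, 0.20, 0.10])
q = np.array([0.15, 0.20, 0.35, 0.20, 0.10])
print(bhattacharyya(p, q), overlap_eta(p, q))
```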
The results of the comparison by experimental condition are presented visually in Figure S12 and numerically in Table S18. The two measures generally agree (note the slight differences in the case of two of the punishment conditions) and suggest that the real-stakes and hypothetical decisions yielded similar distributions of the trust-gain.
Supplementary Figure S12: Comparing the distribution of trust-gain in Study 2 and Study 3. Each experimental condition is denoted by "S," which stands for sharing, or "P," which stands for punishment, signaling whether the Helper or the Punisher is matched with the Trustor in the trust game; the identity of participants in the third-party punishment game is signaled as either a "P," a person, or a "B," a bot, in parentheses after the role they fulfill: P1 (Helper), P2 (Beneficiary), and P3 (Punisher), respectively.